List of AI News about adversarial threats
| Time | Details |
|---|---|
| 2025-12-09 19:47 | AI Security Study by Anthropic Highlights SGTM Limitations in Preventing In-Context Attacks<br>According to Anthropic (@AnthropicAI), its recent study of Selective Gradient Masking (SGTM) was run on small models in a simplified setup and relied on proxy evaluations rather than standard benchmarks. Anthropic also notes that, like conventional data filtering, SGTM does not stop in-context attacks, in which an adversary supplies the sensitive information during the interaction with the model. These limitations point to a business opportunity for AI security tools and more robust benchmarking standards that address real-world adversarial threats (source: AnthropicAI, Dec 9, 2025). |